Method and system for filtering inter-node communication in a data processing system

ABSTRACT

A method and system for communication in a system area network (SAN) data processing system are described. The SAN includes a plurality of interconnected nodes that each have at least one port for communication. To avoid communication-induced errors that may arise, for example, if multiple nodes share the same node ID, the port of a node in the SAN is marked as “fenced” to prevent transmission of packets of a first traffic type while permitting transmission of packets of a second traffic type. The marking of the port may be recorded, for example, in a configuration register of the port. While the port is fenced, only packets of other than the first traffic type are routed via the port. In one preferred embodiment, the second traffic type represents SAN configuration traffic, and the first traffic type represents non-configuration traffic. In this preferred embodiment, the marking of the port may be removed following communication of configuration traffic utilized to negotiate unique node IDs throughout the SAN.

BACKGROUND OF THE INVENTION

[0001] 1. Technical Field

[0002] The present invention relates in general to networks and, in particular, to a method and system for routing communication between nodes of a network containing multiple nodes. Still more particularly, the present invention relates to a network in which communication between nodes of a system area network (SAN) is filtered by traffic type, for example, to avoid errors arising from multiple nodes sharing the same node identifier.

[0003] 2. Description of the Related Art

[0004] A system area network (SAN) is a collection of interconnected processor and peripheral nodes that operate in concert as a data processing system. The SAN topology advantageously permits large data processing systems customized to the processing, storage and I/O requirements of particular installations to be readily constructed through the interconnection of desired numbers of processor and peripheral nodes via backplane connections or inter-cabinet cables.

[0005] To promote reliable, efficient communication, a SAN generally logically and physically isolates processor buses and input/output (I/O) buses in separate nodes. Communication between the processors and peripherals in a SAN must therefore be routed between nodes, for example, utilizing unique node identifiers (IDs) assigned by firmware at system startup.

[0006] The conventional method of routing communication in a SAN is subject to errors if, following system startup, two SANs are interconnected, for example, by an inter-cabinet cable. Errors arise because the node IDs assigned in each of the smaller SANs may not be unique throughout the combined system. As a result, communication in the combined system may be routed incorrectly, possibly causing data corruption and/or other undesirable errors.

[0007] The present invention therefore recognizes that it would be useful and desirable to provide an improved method and system of inter-node communication in a SAN in which all of the nodes may not have unique node IDs.

SUMMARY OF THE INVENTION

[0008] The present invention introduces an improved method and system for communication in a system area network (SAN) data processing system.

[0009] The SAN includes a plurality of interconnected nodes that each have at least one port for communication. To avoid communication-induced errors that may arise, for example, if multiple nodes share the same node ID, the port of a node in the SAN is marked as “fenced” to prevent transmission of packets of a first traffic type while permitting transmission of packets of a second traffic type. The marking of the port may be recorded, for example, in a control register for the port. While the port is fenced, only packets of other than the first traffic type are routed via the port. In one preferred embodiment, the second traffic type represents SAN configuration traffic, and the first traffic type represents non-configuration traffic. In this preferred embodiment, the marking of the port may be removed following communication of configuration traffic utilized to negotiate unique node IDs throughout the SAN.

[0010] All objects, features, and advantages of the present invention will become apparent in the following detailed written description.

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] The novel features believed characteristic of the invention are set forth in the appended claims. The invention itself, however, as well as a preferred mode of use, further objects and advantages thereof, will best be understood by reference to the following detailed description of an illustrative embodiment when read in conjunction with the accompanying drawings, wherein:

[0012] FIG. 1 depicts an illustrative embodiment of a SAN data processing system with which the present invention may advantageously be utilized;

[0013] FIGS. 2A-2E together depict an exemplary communication scenario in which port fencing in accordance with the present invention is utilized to filter inter-node communication according to traffic type;

[0014] FIGS. 3A and 3B illustrate exemplary routing tables utilized to route packets between nodes in accordance with a preferred embodiment;

[0015] FIG. 4 depicts a communication scenario in which errors may arise in the absence of port fencing; and

[0016] FIG. 5 illustrates one embodiment of a packet, which specifies a traffic type in a node ID field of a packet header.

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENT

[0017] With reference now to the figures and in particular with reference to FIG. 1, there is depicted a block diagram of an illustrative embodiment of a SAN data processing system 10 with which the present invention may advantageously be utilized. SAN 10 includes three nodes 16 a-16 c, which may each comprise a processor node containing one or more processors, or a peripheral node containing peripheral devices such as network adapters, nonvolatile data storage devices and adapters, I/O adapters, etc. As illustrated, nodes 16 a-16 c are coupled together for communication by interconnects 40 a and 40 b.

[0018] In an exemplary embodiment in which nodes 16 a-16 c physically reside in different drawers of a cabinet or in different cabinets, interconnects 40 a and 40 b may be implemented as cables containing a number of pairs of unidirectional wires that conduct differential signals in each direction.

[0019] In the depicted embodiment, node 16 b is a processor node containing a system planar 12 coupled to one or more processor cards (in this case processor cards 14 a-14 c). Each processor card 14 carries four general purpose processors 18 that each have an on-chip level one (L1) cache (not illustrated) and an associated level two (L2) cache 20 that provide low latency storage for instructions and data. The processors 18 on each processor card 14 are all connected to address and control bus 24 and to an associated one of data buses 22 a-22 c.

[0020] System planar 12 includes a bus arbiter 26 that regulates access to address and control bus 24 by processors 18, as well as flow control logic 30 and network chip 32, which are each connected to address and control bus 24. Flow control logic 30 is further connected to dual-ported system memory 34 and data switches 28 a-28 d. Network chip 32 is further connected to data switches 28 by data bus 22 d and to each of nodes 16 a and 16 c by a respective one of interconnects 40 a and 40 b.

[0021] Address transactions issued on address and control bus 24 are received by both flow control logic 30 and network chip 32. If an address transaction specifies an address assigned to system memory 34 in node 16 b, flow control logic 30 forwards the specified address to system memory 34 as a memory access request. Flow control logic 30 also supplies control signals to data switches 28 to control the flow of data packets between processor cards 14, system memory 34, and network chip 32. Address and data transactions specifying addresses that are not assigned to system memory 34 are handled by network chip 32, which builds packets for the transactions and routes the packets toward the appropriate destination node(s) 16 by reference to a routing table 36. As shown in FIG. 5, each packet generally contains packet data (or payload) 50 and a packet header 52 including a node ID field 54 that identifies the node ID of a recipient node.
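
By way of illustration only, the packet format of FIG. 5 might be modeled as in the following Python sketch. The class and function names (`PacketHeader`, `Packet`, `build_packet`) are hypothetical, and the 16-bit width of the node ID field is an assumption made for the example rather than a feature stated for the depicted embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketHeader:
    """Models packet header 52 of FIG. 5 (hypothetical layout)."""
    node_id: int  # node ID field 54: the node ID of the recipient node


@dataclass(frozen=True)
class Packet:
    """Models a packet: header 52 plus packet data (payload) 50."""
    header: PacketHeader
    payload: bytes


def build_packet(dest_node_id: int, payload: bytes) -> Packet:
    """Build a packet addressed to dest_node_id.

    The 16-bit range check is an assumption for this sketch only.
    """
    if not 0 <= dest_node_id <= 0xFFFF:
        raise ValueError("node ID does not fit the assumed 16-bit field")
    return Packet(PacketHeader(dest_node_id), payload)


# Example: a packet carrying data for the node whose ID is 6.
packet = build_packet(6, b"store data")
```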

[0022] Referring now to FIGS. 2A-2E, there is depicted a communication scenario in which inter-node communication in a SAN is filtered utilizing port fencing in accordance with a preferred embodiment of the present invention. As shown in FIG. 2A, in the exemplary communication scenario, two SANs 10 a and 10 b are provided, which may each be constructed as described above with respect to FIG. 1. Each of nodes X, Y and Z of SAN 10 a and nodes P, Q and R of SAN 10 b contains a network chip 32 having at least two ports (i.e., ports 0 and 1) that can be connected to another node. As shown in FIG. 1, each port has an associated configuration register. In SAN 10 a, port 0 of node X is connected to port 0 of node Y, port 1 of node Y is connected to port 0 of node Z, and port 1 of each of nodes X and Z is unconnected. Similarly, in SAN 10 b, port 0 of node P is connected to port 1 of node Q, port 0 of node Q is connected to port 0 of node R, and port 1 of each of nodes P and R is unconnected. When powered down as shown in FIG. 2A, none of the nodes has a system-assigned node identifier (ID).

[0023] With reference now to FIG. 2B, following power on, firmware (e.g., stored within a non-volatile storage device, such as a read-only memory (ROM)) in each of SANs 10 a and 10 b independently initializes its SAN. As a part of the initialization process, the firmware in each of SANs 10 a and 10 b discovers the configuration of the SAN (i.e., which nodes are present in the SAN and the interconnections between the nodes) and assigns or negotiates a unique node ID by which each node will be identified in inter-node communication. The node IDs are stored in association with the associated port numbers in the routing table 36 of each node in the SAN. For example, FIG. 3A illustrates an exemplary routing table 36 for node Y of SAN 10 a, which associates node ID 5 (i.e., the ID of node X) with port 0 and associates node ID 4 (i.e., the ID of node Z) with port 1. As illustrated in FIG. 3B, routing table 36 of node X in SAN 10 a similarly associates node IDs 4 and 6 (i.e., the IDs of nodes Y and Z) with port 0 and does not associate any node ID with port 1 because it is unconnected.
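
The routing tables of FIGS. 3A and 3B may be pictured, purely as a sketch, as mappings from node ID to port number. In the following Python example the dictionary representation and the helper `lookup_port` are assumptions made for illustration; the entries themselves are the ones recited above for nodes Y and X.

```python
from typing import Dict, Optional

# FIG. 3A: routing table 36 of node Y (node ID -> port number).
ROUTING_TABLE_NODE_Y: Dict[int, int] = {
    5: 0,  # node X is reached via port 0
    4: 1,  # node Z is reached via port 1
}

# FIG. 3B: routing table 36 of node X. Both remote nodes are reached
# via port 0; port 1 is unconnected and therefore has no entries.
ROUTING_TABLE_NODE_X: Dict[int, int] = {
    4: 0,
    6: 0,
}


def lookup_port(routing_table: Dict[int, int], node_id: int) -> Optional[int]:
    """Return the outgoing port for node_id, or None if no route is known."""
    return routing_table.get(node_id)


port_from_y_to_x = lookup_port(ROUTING_TABLE_NODE_Y, 5)
port_from_x_to_6 = lookup_port(ROUTING_TABLE_NODE_X, 6)
unknown_route = lookup_port(ROUTING_TABLE_NODE_X, 9)
```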

[0024] Because only a limited number of node IDs are available, it is possible, particularly in large SANs, for a node in SAN 10 a to be assigned the same node ID as a node in SAN 10 b. For example, as shown in FIG. 2B, node IDs 4, 5 and 6 are assigned to nodes in both of SANs 10 a and 10 b. To prevent errors from arising in the event that SANs sharing at least one common node ID are connected, the unconnected ports of SAN nodes are by default “fenced,” which, as discussed further below, means that traffic directed to such ports is filtered based upon traffic type to exclude certain traffic types and permit at least one other traffic type. In a preferred embodiment, the fenced state of unconnected ports, if any, is recorded in configuration registers 38. As shown in FIG. 2B, port 1 of each of nodes X, Z, P and R is marked as “fenced” (as represented by an “F”).
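
The default fencing of unconnected ports might be sketched as follows. The `PortConfig` class stands in for a configuration register 38, and its fields and the function `fence_unconnected_ports` are hypothetical names chosen for this example; the actual register layout is not specified here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PortConfig:
    """Stand-in for a per-port configuration register 38 (assumed fields)."""
    connected: bool
    fenced: bool = False


def fence_unconnected_ports(ports: Dict[int, PortConfig]) -> None:
    """Mark every unconnected port as fenced, as done by default at
    initialization. Connected ports are left unchanged."""
    for config in ports.values():
        if not config.connected:
            config.fenced = True


# Node X of SAN 10 a: port 0 is cabled to node Y, port 1 is unconnected.
node_x_ports = {0: PortConfig(connected=True), 1: PortConfig(connected=False)}
fence_unconnected_ports(node_x_ports)
```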

[0025] The importance of port fencing in accordance with the present invention can be seen by comparing FIGS. 2C-2E with FIG. 4. In both communication scenarios, a larger SAN 10 c is formed by connecting port 1 of node Z in SAN 10 a to port 1 of node P after independent power-on and initialization of SANs 10 a and 10 b. It is also assumed that node Q normally sends packets of the format illustrated in FIG. 5 to node R via its port 0.

[0026] As shown in FIG. 4, without port fencing, if node Q detects an error in sending packets to node R via port 0, node Q may redirect the packets to port 1 in an attempt to reach node R via an alternative route. Upon receipt at node Z, the network chip 32 of node Z will route the packets to node Y based upon the node ID 6 specified in the packets' node ID field 54. Because node Y is not the intended recipient of the packets, processing of the packets by node Y may result in data corruption, system failure, and/or other undesirable errors.

[0027] FIGS. 2C-2E illustrate how such errors can be avoided by port fencing. As described above, following startup and initialization of SANs 10 a and 10 b, port 1 of each of nodes X, Z, P and R is fenced by appropriate settings in the configuration registers 38 of these nodes. As shown in FIG. 2C, if SANs 10 a and 10 b are subsequently joined to form SAN 10 c by connecting port 1 of node Z and port 1 of node P with a cable, the fencing of port 1 of node P prevents certain (e.g., non-configuration) traffic from being routed to node Y. In a preferred embodiment, network chip 32 implements fencing by comparing the node ID field 54 in a packet with a predetermined value or values (e.g., 0xFFFF) utilized to identify permitted (e.g., SAN configuration) traffic. If network chip 32 determines that the value of node ID field 54 does not match the predetermined value, network chip 32 does not permit the packet to be routed via a fenced port. Accordingly, if node Q experiences errors in sending packets to node R and attempts to send the packets via an alternative route, for example, through node P, network chip 32 of node P will drop the packets rather than routing them to node Y via port 1, thus avoiding errors that may result from multiple nodes in SAN 10 c sharing the same node ID.
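
The comparison performed for a fenced port can be sketched as below. This is a minimal model under stated assumptions, not the logic of network chip 32 itself: the function name `may_route_via_port` and the constant name are hypothetical, and a single permitted value (0xFFFF, the example value given above) is assumed although the text allows several.

```python
# Example value from the text identifying permitted (configuration) traffic.
CONFIG_TRAFFIC_NODE_ID = 0xFFFF


def may_route_via_port(node_id_field: int, port_fenced: bool) -> bool:
    """Decide whether a packet may be routed via a port.

    An unfenced port passes every packet. A fenced port passes only
    packets whose node ID field matches the predetermined value; all
    other packets are dropped by the caller.
    """
    if not port_fenced:
        return True
    return node_id_field == CONFIG_TRAFFIC_NODE_ID


# Node P, port 1 fenced: node Q's redirected packet for node ID 6 is
# dropped, while a configuration packet is allowed through.
data_packet_allowed = may_route_via_port(6, port_fenced=True)
config_packet_allowed = may_route_via_port(0xFFFF, port_fenced=True)
```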

[0028] Although port fencing prevents the transmission of certain (e.g., non-configuration) packets, port fencing in accordance with the present invention does not present a barrier to selected traffic types, such as SAN configuration traffic. For example, as shown in FIG. 2D, following the interconnection of SANs 10 a and 10 b to form SAN 10 c, network chips 32 in nodes Z and P permit configuration packets, which are designated by a value of 0xFFFF in node ID field 54, to flow between port 1 of node Z and port 1 of node P. Such configuration traffic may be utilized, for example, by SAN operating system software to negotiate unique node IDs across SAN 10 c. As shown in FIG. 2E, once unique node IDs have been negotiated, the configuration traffic may direct network chips 32 to remove the fencing of port 1 of each of nodes Z and P by updating the associated configuration registers. As a result, packets of all traffic types may freely flow between nodes Z and P without generating errors.
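
The ordering described above, in which fencing is removed only once node IDs are unique across SAN 10 c, might be sketched as follows. The negotiation procedure itself is not specified, so this example only checks the outcome; the function names, the set-based record of fenced ports, the pre-negotiation IDs assumed for nodes P and Q, and the post-negotiation IDs 7, 8 and 9 are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

Port = Tuple[str, int]  # (node name, port number)


def ids_are_unique(node_ids: Dict[str, int]) -> bool:
    """True if no two nodes share a node ID."""
    return len(set(node_ids.values())) == len(node_ids)


def unfence_if_unique(node_ids: Dict[str, int],
                      fenced: Set[Port],
                      link: Tuple[Port, Port]) -> bool:
    """Remove fencing from both ends of a link only if node IDs are unique."""
    if not ids_are_unique(node_ids):
        return False
    for port in link:
        fenced.discard(port)
    return True


fenced_ports: Set[Port] = {("X", 1), ("Z", 1), ("P", 1), ("R", 1)}
joined_link = (("Z", 1), ("P", 1))

# IDs right after SANs 10 a and 10 b are joined (P and Q values assumed).
ids_before = {"X": 5, "Y": 6, "Z": 4, "P": 4, "Q": 5, "R": 6}
# Hypothetical result of negotiation over configuration traffic.
ids_after = {"X": 5, "Y": 6, "Z": 4, "P": 7, "Q": 8, "R": 9}

first_attempt = unfence_if_unique(ids_before, fenced_ports, joined_link)
second_attempt = unfence_if_unique(ids_after, fenced_ports, joined_link)
```

In this sketch the first attempt leaves the fencing in place because IDs 4, 5 and 6 are duplicated, and the second attempt clears port 1 of nodes Z and P while port 1 of nodes X and R, which remain unconnected, stays fenced.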

[0029] As has been described, the present invention provides an improved method and system for filtering communication between nodes of a system area network (SAN). In accordance with the present invention, communication ports not connected to another node are, by default, “fenced” to prevent one or more traffic types (e.g., non-configuration traffic) from being routed through the ports. However, traffic of one or more selected traffic types (e.g., configuration traffic) can be routed through the fenced port.

[0030] While the invention has been particularly shown and described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention. For example, although in a preferred embodiment, the traffic type of packets is identified by particular values of a node ID field of a packet header, those skilled in the art will appreciate that the traffic type can be indicated in other ways, such as by a dedicated traffic type field in a packet header or in a separate data structure associated with a packet or packet flow.

[0031] Moreover, although aspects of the present invention have been described with respect to a computer system executing software that directs the functions of the present invention, it should be understood that the present invention may alternatively be implemented as a program product for use with a data processing system. Programs defining the functions of the present invention can be delivered to a data processing system via a variety of signal-bearing media, which include, without limitation, non-rewritable storage media (e.g., CD-ROM), rewritable storage media (e.g., a floppy diskette or hard disk drive), and communication media, such as digital and analog networks. It should be understood, therefore, that such signal-bearing media, when carrying or encoding computer readable instructions that direct the functions of the present invention, represent alternative embodiments of the present invention.

What is claimed is:
 1. A method of communication in a system area network including a plurality of interconnected nodes that each have at least one port, said method comprising: marking a port to prevent transmission to another node of packets of a first traffic type while permitting transmission to another node of packets of a second traffic type; and thereafter, routing via said port only packets not of said first traffic type.
 2. The method of claim 1, and further comprising storing a routing table that associates ports with node identifiers, and wherein routing comprises routing by reference to said routing table.
 3. The method of claim 2, wherein marking said port comprises marking said port in a port configuration register.
 4. The method of claim 1, and further comprising determining a traffic type by reference to a packet header.
 5. The method of claim 1, said first traffic type comprising non-configuration traffic and said second traffic type comprising configuration traffic, wherein marking comprises marking said port to prevent transmission to another node of packets of non-configuration traffic while permitting transmission to another node of packets of configuration traffic.
 6. The method of claim 5, and further comprising, following transmission of packets of configuration traffic, removing said marking of said port.
 7. The method of claim 6, and further comprising: in response to transmission of said packets of configuration traffic, altering at least one node identifier used in packet routing.
 8. The method of claim 1, wherein marking comprises automatically marking in response to said port being unconnected at initialization of the system area network.
 9. A node for a system area network, said node comprising: at least one device coupled to a network chip having a port for interconnection to another node, wherein responsive to said port being marked to prevent transmission of a first traffic type via said port while permitting transmission of packets of a second traffic type, said network chip routes via said port only packets not of said first traffic type.
 10. The node of claim 9, and further comprising a routing table accessible to said network chip that associates ports with node identifiers, wherein said network chip routes packets by reference to said routing table.
 11. The node of claim 10, and further comprising a port configuration register containing said marking of said port.
 12. The node of claim 9, wherein said network chip determines a traffic type of a packet by reference to a packet header of the packet.
 13. The node of claim 9, wherein said first traffic type comprises non-configuration traffic and said second traffic type comprises configuration traffic.
 14. The node of claim 13, wherein following transmission of packets of configuration traffic said network chip removes said marking of said port.
 15. The node of claim 14, wherein said network chip, responsive to transmission of said packets of configuration traffic, alters a node identifier used in packet routing.
 16. The node of claim 9, wherein said network chip automatically marks said port if said port is unconnected at initialization of the system area network.
 17. A system area network, comprising: a plurality of interconnected nodes including at least one node according to claim 9.
 18. A network chip for a node in a system area network including a plurality of nodes, said network chip comprising: a port for inter-node communication; means for marking the port to prevent transmission to another node of packets of a first traffic type while permitting transmission to another node of packets of a second traffic type; and means for, if said port is marked, routing via said port only packets not of said first traffic type.
 19. The network chip of claim 18, and further comprising a routing table that associates said port with a node identifier of at least one of said plurality of nodes, wherein said means for routing routes packets by reference to said routing table.
 20. The network chip of claim 19, wherein said means for marking comprises means for marking said port by setting a port configuration register.
 21. The network chip of claim 18, wherein said network chip comprises means for determining a traffic type of a packet by reference to a packet header of the packet.
 22. The network chip of claim 18, wherein said first traffic type comprises non-configuration traffic and said second traffic type comprises configuration traffic.
 23. The network chip of claim 22, and further comprising means for, following transmission of packets of configuration traffic, removing said marking of said port.
 24. The network chip of claim 23, and further comprising means for, responsive to transmission of said packets of configuration traffic, altering a node identifier used in packet routing.
 25. The network chip of claim 18, wherein said means for marking comprises means for automatically marking said port if said port is unconnected at initialization of the system area network.